Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning

  • Create Date:2021-04-26 11:58:43
  • Update Date:2025-09-07
  • Author:Benjamin Bengfort
  • ISBN:1491963042

Summary

The programming landscape of natural language processing has changed dramatically in the past few years. Machine learning approaches now require mature tools like Python's scikit-learn to apply models to text at scale. This practical guide shows programmers and data scientists who have an intermediate-level understanding of Python and a basic understanding of machine learning and natural language processing how to become more proficient in these two exciting areas of data science.

This book presents a concise, focused, and applied approach to text analysis with Python, and covers topics including text ingestion and wrangling, basic machine learning on text, classification for text analysis, entity resolution, and text visualization. Applied Text Analysis with Python will enable you to design and develop language-aware data products.

You'll learn how and why machine learning algorithms make decisions about language to analyze text; how to ingest, wrangle, and preprocess language data; and how the three primary text analysis libraries in Python work in concert. Ultimately, this book will enable you to design and develop language-aware data products.

Reviews

Craig Nicol

A great step-by-step overview of a variety of text analysis techniques, taking the reader from beginner through to complex analyses using Spark and scikit-learn.

Jevgenij

This book has a very good first half and a very bad second half. I'd prefer fewer code examples and more explanations of _why_, but the first chapters are still very approachable and explain things in simple terms with plenty of examples. The second half is just overly complicated; many terms and strategies are left unexplained or barely explained.

Suhrob

I don't understand the high ratings here. The book focuses mostly on old approaches: stuck mostly in NLTK, with only bits of spaCy and gensim. Vector representations are mentioned only briefly. The author is sceptical about this new whipper-snapper technology called deep learning, and gives you only a few pages of the simplest Keras implementation. If it was 2015 I would understand. This book is from 2018.5. Also it would be OK if the aim was to explain all the basics of tokenization, PoS tagging, lemmatization etc., but all is handled very superficially. You won't get much understanding here unless you've heard about them elsewhere. Most space is dedicated to bending the NLTK and sklearn APIs to work together... The "practical" examples seemed quite shallow, unfounded and unclear (I liked the gender analysis in the first part of the book. It is downhill from there). I DID like the parallelization and Spark parts though! So at least something...

Theo

I find it really hard to follow through the chapters. The code chunks are in bits and pieces, and I find it really hard to put them together. The book spends great effort on its own data ingestion engine. Nothing wrong with that. But if you attempt to use or install it, that is when the trouble starts. I find the content covered is really good and wish the authors had made the code easy to follow and execute.

Alexis Idlette-Wilson

Lots of code examples and detailed explanations of how analytics could solve a variety of business problems. Well written and a good reference.

Mahmoud Rabie

I really had a hard time reading the book. It contains too much text that I think could be shortened and presented in better ways. The book also didn't present the code in a good format: just chunks of code written here and there, and you need to keep following these lines. The book's code on GitHub still has a lot of issues and won't run without fixing them (the book is still in the early release phase). Some chapters were complex for me and I didn't get much out of them (i.e. Context-Aware Text Analysis, Text Visualization). Also, I was expecting the book to focus more on the analysis part, but I found that a big part of the book was wasted on building corpus readers and other stuff not related to "analysis". I was expecting the book to focus more on the feature extraction and vectorization parts, in detail and with enough code samples.